Submitted to Eurospeech’99, Budapest SPEECH/MUSIC DISCRIMINATION BASED ON POSTERIOR PROBABILITY FEATURES

Authors

Gethin Williams

Daniel P.W. Ellis

Abstract

A hybrid connectionist-HMM speech recognizer uses a neural network acoustic classifier. This network estimates the posterior probability that the acoustic feature vectors at the current time step should be labelled as each of around 50 phone classes. We sought to exploit informal observations of the distinctions in this posterior domain between nonspeech audio and speech segments well-modeled by the network. We describe four statistics that successfully capture these differences, and which can be combined to make a reliable speech/nonspeech categorization that is closely related to the likely performance of the speech recognizer. We test these features on a database of speech/music examples, and our results match the previously-reported classification error, based on a variety of special-purpose features, of 1.4% for 2.5 second segments. We also show that recognizing segments ordered according to their resemblance to clean speech can result in an error rate close to the ideal minimum over all such subsetting strategies.
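To make the idea of statistics over the posterior domain concrete, here is a minimal sketch. The abstract does not name the four statistics, so the two computed here (mean per-frame entropy and mean frame-to-frame change, called "dynamism" below) are illustrative assumptions, not necessarily the authors' features. The toy posterior vectors and all function names are hypothetical. Only the figure of roughly 50 phone classes comes from the abstract.

```python
import math
from typing import List, Sequence


def frame_entropy(posterior: Sequence[float]) -> float:
    """Shannon entropy (in bits) of one frame's phone posterior distribution."""
    return -sum(p * math.log2(p) for p in posterior if p > 0.0)


def mean_entropy(frames: Sequence[Sequence[float]]) -> float:
    """Average per-frame entropy over a segment."""
    return sum(frame_entropy(f) for f in frames) / len(frames)


def mean_dynamism(frames: Sequence[Sequence[float]]) -> float:
    """Mean squared change in posteriors between successive frames."""
    if len(frames) < 2:
        return 0.0
    total = 0.0
    for prev, cur in zip(frames, frames[1:]):
        total += sum((c - p) ** 2 for p, c in zip(prev, cur)) / len(cur)
    return total / (len(frames) - 1)


def peaked(n_classes: int, winner: int, confidence: float = 0.9) -> List[float]:
    """Hypothetical posterior vector with most of its mass on one class."""
    rest = (1.0 - confidence) / (n_classes - 1)
    return [confidence if k == winner else rest for k in range(n_classes)]


N_CLASSES = 50  # "around 50 phone classes", per the abstract

# Toy data (assumed): confident, changing posteriors vs. flat, static ones.
speech_like = [peaked(N_CLASSES, w) for w in (3, 3, 17, 17, 42)]
flat_like = [[1.0 / N_CLASSES] * N_CLASSES for _ in range(5)]

if __name__ == "__main__":
    for name, seg in (("speech-like", speech_like), ("flat", flat_like)):
        print(name, round(mean_entropy(seg), 3), round(mean_dynamism(seg), 5))
```

In this toy setup the confident segment has lower entropy and nonzero dynamism, while the flat segment has maximal entropy and zero dynamism. A real system would take the posterior vectors from the acoustic classifier and feed such statistics to a separate speech/nonspeech classifier.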

Related resources

Speech/music discrimination based on posterior probability features

A Sphinx Based Speech-music Segmentation Front-end for Improving the Performance of an Automatic Speech Recognition System in Turkish

In this study, a system that segments an audio signal into speech and music using posterior-probability-based features is proposed and implemented in Sphinx. Unlike earlier efforts that use Multi-Layer Perceptrons (MLP), this system uses Hidden Markov Model based acoustic models trained in Sphinx for posterior probability calculations. Acoustic Models are trained with the HMM-state...

Experiments on Speech/Music Discrimination

The problem of speech/music discrimination has become increasingly important as automatic speech recognition systems are applied to more real-world multimedia domains. One of the issues in the design of a signal classifier is the selection of an appropriate feature set that captures the temporal and spectral structures of the signal. Many features have been used in speech/music discrimination. Th...

Submitted to Eurospeech’99, Budapest MULTI-STREAM SPEECH RECOGNITION: READY FOR PRIME TIME?

Multi-stream and multi-band methods can improve the accuracy of speech recognition systems without overly increasing the complexity. However, they cannot be applied blindly. In this paper, we review our experience applying multi-stream and multi-band methods to the Broadcast News corpus. We found that multi-stream systems using different acoustic front-ends provide a significant improvement over...

Feature fusion for music detection

Automatic discrimination between music, speech and noise has grown in importance as a research topic over recent years. The need to classify audio into categories such as music or speech is an important part of the multimedia document retrieval problem. This paper extends work previously carried out by the authors which compared performance of static and transitional features based on cepstra, ...

Publication date: 1999
